Search results for "random forests"

showing 10 items of 10 documents

A Methodology to Derive Global Maps of Leaf Traits Using Remote Sensing and Climate Data

2018

This paper introduces a modular processing chain to derive global high-resolution maps of leaf traits. In particular, we present global maps at 500 m resolution of specific leaf area, leaf dry matter content, leaf nitrogen and phosphorus content per dry mass, and leaf nitrogen/phosphorus ratio. The processing chain exploits machine learning techniques along with optical remote sensing data (MODIS/Landsat) and climate data for gap filling and up-scaling of in-situ measured leaf traits. The chain first uses random forests regression with surrogates to fill gaps in the database (> 45% of missing entries) and maximizes the global representativeness of the trait dataset. Plant species are then a…

0106 biological sciences; FOS: Computer and information sciences; 010504 meteorology & atmospheric sciences; Specific leaf area; Climate; Bos- en Landschapsecologie; Soil Science; FOS: Physical sciences; Applied Physics (physics.app-ph); 010603 evolutionary biology; 01 natural sciences; Statistics - Applications; Goodness of fit; Abundance (ecology); Machine learning; Forest and Landscape Ecology; Applications (stat.AP); Computers in Earth Sciences; Plant ecology; Vegetatie; 0105 earth and related environmental sciences; Remote sensing; Mathematics; 2. Zero hunger; Plant traits; Vegetation; Data stream mining; Climate; Landsat; Machine learning; MODIS; Plant ecology; Plant traits; Random forests; Remote sensing; Soil Science; Geology; Computers in Earth Sciences; Global Map; Regression analysis; Geology; Physics - Applied Physics; 15. Life on land; Random forests; Remote sensing; PE&RC; Random forest; MODIS; Trait; Vegetatie Bos- en Landschapsecologie; Vegetation Forest and Landscape Ecology; Landsat

researchProduct

A Methodological Framework to Discover Pharmacogenomic Interactions Based on Random Forests

2021

The identification of genomic alterations in tumor tissues, including somatic mutations, deletions, and gene amplifications, produces large amounts of data, which can be correlated with a diversity of therapeutic responses. We aimed to provide a methodological framework to discover pharmacogenomic interactions based on Random Forests. We matched two databases from the Cancer Cell Line Encyclopaedia (CCLE) project, and the Genomics of Drug Sensitivity in Cancer (GDSC) project. For a total of 648 shared cell lines, we considered 48,270 gene alterations from CCLE as input features and the area under the dose-response curve (AUC) for 265 drugs from GDSC as the outcomes. A three-step reduction t…

0301 basic medicine; Random Forests; Pharmacogenomic Variants; drug response; Genomics; Computational biology; cell lines; Biology; QH426-470; Article; 03 medical and health sciences; 0302 clinical medicine; Neoplasms; Drug response; Genetics; Humans; cancer; Gene Regulatory Networks; genomic alteration; Genetics (clinical); Random Forest; cell line; genomic alterations; Tumor tissue; Random forest; pharmacogenomic interactions; 030104 developmental biology; Concordance correlation coefficient; Drug Resistance Neoplasm; 030220 oncology & carcinogenesis; Pharmacogenomics; Identification (biology); pharmacogenomic interactions.; Cancer cell lines; Algorithms; Genome-Wide Association Study; Genes

researchProduct

Démarche statistique pour la sélection des indicateurs par Random Forests pour la surveillance de la qualité des sols

2013

The volume of data, and the large number of biological variables to be tested (one hundred), require analytical techniques, such as Random Forests, which can overcome the problem of multicollinearity for the selection of indicators, sensitive to various factors. Random Forests methodology is appropriate for the selection of the most discriminant variables. So, we searched for the best way to select them, by bringing together all biological variables, representing the Microflora and Fauna. This approach focuses on impact indicators from the Bio2 program; indicators of flora and indicators of accumulation (snails) were not included. This work has been implemented on the three factors of discrimina…

researchProduct

Classification of Melanoma Lesions Using Sparse Coded Features and Random Forests

2016

Malignant melanoma is the most dangerous type of skin cancer, yet it is the most treatable kind of cancer, conditioned by its early diagnosis which is a challenging task for clinicians and dermatologists. In this regard, CAD systems based on machine learning and image processing techniques are developed to differentiate melanoma lesions from benign and dysplastic nevi using dermoscopic images. Generally, these frameworks are composed of sequential processes: pre-processing, segmentation, and classification. This architecture faces mainly two challenges: (i) each process is complex with the need to tune a set of parameters, and is specific to a given dataset; (ii) the…

Computer science; Sparse coding; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; Scale-invariant feature transform; Image processing; Dermoscopy; 02 engineering and technology; [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing; 030218 nuclear medicine & medical imaging; 03 medical and health sciences; 0302 clinical medicine; Histogram; 0202 electrical engineering electronic engineering information engineering; medicine; Computer vision; Segmentation; Melanoma; [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing; business.industry; Melanoma; Cancer; Pattern recognition; Image segmentation; Sparse approximation; Random forests; medicine.disease; Classification; Random forest; 020201 artificial intelligence & image processing; Artificial intelligence; Skin cancer; Neural coding; business; [SPI.SIGNAL] Engineering Sciences [physics]/Signal and Image processing

researchProduct

Growing stock volume from multi-temporal landsat imagery through google earth engine

2019

Growing stock volume (GSV) is one of the most important variables for forest management and is traditionally estimated from ground measurements. These measurements are expensive and therefore sparse and hard to maintain in time on a regular basis. Remote sensing data combined with national forest inventories constitute a helpful tool to estimate and map forest attributes. However, most studies on GSV estimation from remote sensing data focus on small forest areas with a single or only a few species. The current study aims to map GSV in peninsular Spain, a rather large and very heterogeneous area. Around 50 000 wooded land plots from the Third Spanish National Forest Inventory (NFI3) were u…

Global and Planetary Change; Mean squared error; Growing stock volume; Forest management; Management Monitoring Policy and Law; Reflectivity; Random forest; Spain; Multicollinearity; Environmental science; Short wave infrared; Computers in Earth Sciences; Guided regularized random forests; Google Earth Engine; Landsat; Image resolution; Stock (geology); Earth-Surface Processes; Remote sensing; International Journal of Applied Earth Observation and Geoinformation

researchProduct

Robust estimation of mean electricity consumption curves by sampling for small areas in presence of missing values

2017

In this thesis, we address the problem of robust estimation of mean or total electricity consumption curves by sampling in a finite population for the entire population and for small areas. We are also interested in estimating mean curves by sampling in presence of partially missing trajectories. Indeed, many studies carried out in the French electricity company EDF, for marketing or power grid management purposes, are based on the analysis of mean or total electricity consumption curves at a fine time scale, for different groups of clients sharing some common characteristics. Because of privacy issues and financial costs, it is not possible to measure the electricity consumption curve of eac…

Linear mixed models; Small area estimation; Missing data; Regression trees; Estimation sur petits domaines; [MATH.MATH-GM] Mathematics [math]/General Mathematics [math.GM]; Estimateurs à noyau; Modèles linéaires mixtes; Random forests; Biais conditionnels; Functional data; Survey sampling; [MATH.MATH-GM] Mathematics [math]/General Mathematics [math.GM]; Robustesse; Données fonctionnelles; Plus proches voisins; Forêts aléatoires; Conditional bias; Kernel estimators; Nearest neighbours; Sondage; Données manquantes; Robustness; Arbres de régression

researchProduct

Global Estimation of Biophysical Variables from Google Earth Engine Platform

2018

This paper proposes a processing chain for the derivation of global Leaf Area Index (LAI), Fraction of Absorbed Photosynthetically Active Radiation (FAPAR), Fraction Vegetation Cover (FVC), and Canopy Water Content (CWC) maps from 15 years of MODIS data, exploiting the capabilities of the Google Earth Engine (GEE) cloud platform. The retrieval chain is based on a hybrid method inverting the PROSAIL radiative transfer model (RTM) with random forests (RF) regression. A major feature of this work is the implementation of a retrieval chain exploiting the GEE capabilities using global and climate data records (CDR) of both MODIS surface reflectance and LAI/FAPAR datasets allowing the global estim…

random forests; CWC; 010504 meteorology & atmospheric sciences; Mean squared error; Science; 0211 other engineering and technologies; Google Earth Engine; LAI; FVC; FAPAR; CWC; plant traits; random forests; PROSAIL; 02 engineering and technology; Land cover; 01 natural sciences; Atmospheric radiative transfer codes; Range (statistics); Parametrization (atmospheric modeling); FAPAR; Leaf area index; 021101 geological & geomatics engineering; 0105 earth and related environmental sciences; Remote sensing; PROSAIL; Q; 15. Life on land; FVC; LAI; Random forest; plant traits; 13. Climate action; Photosynthetically active radiation; General Earth and Planetary Sciences; Environmental science; Google Earth Engine; Remote Sensing; Volume 10; Issue 8; Pages: 1167

researchProduct

Performance Dissimilarities in European Union Manufacturing: The Effect of Ownership and Technological Intensity

2021

Our paper addresses the relevance of a set of continuous and categorical variables that describe industry characteristics to differences in performance between foreign versus locally owned companies in industries with dissimilar levels of technological intensity. Including data on manufacturing sector performance from 20 European Union member countries and covering the 2009–2016 period, we used the random forests methodology to identify the best predictors of EU manufacturing industries’ a priori classification based on two main attributes: ownership (foreign versus local) and technological intensity. We found that EU foreign-owned businesses dominate locally owned ones in terms of size, wh…

random forests; Geography Planning and Development; TJ807-830; Management Monitoring Policy and Law; TD194-195; Eu countries; Renewable energy sources; Manufacturing; media_common.cataloged_instance; GE1-350; European Union; European union; Categorical variable; high-tech industries; Industrial organization; media_common; Environmental effects of industries and plants; Renewable Energy Sustainability and the Environment; business.industry; High tech; Environmental sciences; Multinational corporation; foreign investors; Cash flow; business; performance; Intensity (heat transfer); Sustainability

researchProduct

Application of selected methods of black box for modelling the settleability process in wastewater treatment plant

2017

The paper describes how measurements of inflow wastewater temperature in the chamber and of the degree of external and internal recirculation in the biological-mechanical wastewater treatment plant (WWTP) in Cedzyna near Kielce, Poland, were used to predict the settleability of activated sludge. Three methods, namely multivariate adaptive regression splines (MARS), random forests (RF) and modified random forests (RF + SOM), were employed to compute activated sludge settleability. The results of the analysis indicate that modified random forests demonstrate the best predictive abilities.

random forests; modified random forests; sludge settleability; multivariate adaptive regression splines; Ecological Chemistry and Engineering S-Chemia I Inzynieria Ekologiczna S

researchProduct

Application of selected supervised classification methods to bank marketing campaign

2016

Supervised classification covers a number of data mining methods based on training data. These methods have been successfully applied to solve complex multi-criteria classification problems in many domains, including economic issues. In this paper we discuss features of some supervised classification methods based on decision trees and apply them to the direct marketing campaigns data of a Portuguese banking institution. We discuss and compare the following classification methods: decision trees, bagging, boosting, and random forests. A classification problem in our approach is defined in a scenario where a bank’s clients make decisions about the activation of their deposits. The obtained…

random forests; r project; classification; decision trees; boosting; data mining; bank marketing; bagging; supervised learning; Information Systems in Management = Systemy Informatyczne w Zarządzaniu

researchProduct